ABSTRACT

A usability study was used to measure user performance and user preferences for
a CAVE™ immersive stereoscopic virtual environment with wand interfaces,
compared directly with a traditional non-stereoscopic CAD workstation interface
with keyboard and mouse. In both the CAVE™ and the adaptable technology
environments, CrystalEyes glasses are used to produce a stereoscopic view. An
Ascension Flock of Birds tracking system is used to track the positions of the
user’s head and the wand pointing device in 3D space.


It is argued that these immersive technologies, including the use of gestures
and hand movements, make a more natural interface possible in immersive virtual
environments. Such an interface allows users to recognize geometry, interact
within a spatial environment, find errors, and navigate through a virtual
environment more rapidly and efficiently. The wand interface provides
a significantly improved means of interaction. This study quantitatively measures
the differences in interaction when compared with traditional human computer
interfaces.

This paper provides an analysis, via usability study methods, of Fault
Identification, termed Benchmark 4. During testing, testers were given some time
to “play around” with the CAVE™ environment for familiarity before undertaking a
specific exercise. The testers were then instructed regarding the tasks to be
completed and were asked to work quickly without sacrificing accuracy. The
research team timed each task and recorded activity on evaluation sheets for the
Fault Identification test. At the completion of the testing scenario involving
Fault Identification, the subjects/testers were given a survey document and asked
to respond by checking boxes to communicate their subjective opinions.


Keywords:  Usability  Analysis;  CAVE™  (Cave  Automatic  Virtual  Environments);  Human 
Computer  Interface  (HCI);  Benchmark;  Virtual  Reality;  Virtual  Environments; 
Competitive  Comparison 


INTRODUCTION 

This paper is an extension of the work done by Satter (2005) on Competitive
Usability Studies of Virtual Environments for Shipbuilding. The key difference is
the use of a new immersive environment called CAVE™. The significance and a detailed
description of this study are explained by Satter (2012) in his recent paper. Here we
present only the details of this usability study. The CAVE™ was developed at the University of
Illinois at Chicago and provides the illusion of immersion by projecting stereo images on the
walls and floor of a room-sized cube. Several users wearing lightweight stereo glasses can enter
and walk freely inside the CAVE™. A head tracking system continuously adjusts the stereo
projection to the current position of the leading viewer. Schematics of the CAVE™ and wand
system are shown in Figures 1 and 2.


Figure 1: Schematic of the CAVE™ System

Figure 2: The Wand Interface

BENCHMARK  4  (FAULT  IDENTIFICATION) 

1.  Description 

In a typical design review process, a design space is presented to the reviewer(s), who examine
the space for design flaws (faults). The purpose of this study is to help determine the
applicability/usability of various user interfaces (both stereoscopic and non-stereoscopic) in
improving this process. Based on the preliminary results of the previous Benchmark testing, a
fourth Benchmark scenario was prepared to use the stereoscopic CAVE™ environment for the
location and identification of faults within a design space. The scenario implemented and
reported here is built upon the operations and scenarios developed for Benchmarks 1, 2, and 3.
Ten distinct design faults, similar to those prepared for Benchmark 2 (find/repair), were
injected into the same virtual factory space used for Benchmark 1. However,
Benchmark 4 testing requires only that the users utilize the interface to locate and identify as
many of these faults as possible in four minutes. As with the previous testing, each user searched
for the faults using both the traditional CAD workstation (non-stereoscopic interface) and the
stereoscopic wand interface in the CAVE™ environment. The sequence of the two scenarios
(non-stereoscopic vs. CAVE™) was randomized, and users were randomly assigned to start with
either the non-stereoscopic interface or the CAVE™ environment.

As each user progressed through the active scenario/environment locating and identifying faults,
the specific fault and the elapsed time were recorded for the analysis. Although this method
provides a significant quantity of data, for Benchmark 4 the key metric for comparison was the
total number of faults found in each environment.
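
The recording scheme described above can be sketched in a few lines of Python. This is only an illustration: the record layout, the names `FaultEvent` and `faults_found`, and the sample log are assumptions made for the example, not the evaluation sheets or data used in the study. It shows how a log of (fault, elapsed time) entries reduces to the key metric, the number of distinct faults found within the four-minute limit.

```python
from dataclasses import dataclass

TIME_LIMIT_S = 4 * 60  # four-minute search window stated in the text


@dataclass(frozen=True)
class FaultEvent:
    fault_id: int     # hypothetical label (1..10) for an injected fault
    elapsed_s: float  # seconds since the start of the pass


def faults_found(events, time_limit_s=TIME_LIMIT_S):
    """Count distinct faults identified within the time limit."""
    return len({e.fault_id for e in events if e.elapsed_s <= time_limit_s})


# Made-up log for one user in one environment (illustrative only).
log = [
    FaultEvent(3, 12.5),
    FaultEvent(7, 40.0),
    FaultEvent(3, 55.0),   # repeated identification, counted once
    FaultEvent(1, 239.0),
    FaultEvent(9, 251.0),  # after the four-minute limit, not counted
]
print(faults_found(log))  # prints 3
```

Keeping the elapsed time with each entry preserves the fuller data set mentioned above, while the comparison itself only needs the final count.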

This exercise (Benchmark 4) was repeated in each of the two environments under test, and the
User Survey was administered to each user after each pass in each environment. As with the other
Benchmark testing, sequencing of the testers through the two environments was randomized so
that not all of the users were testing the same interface in the same order. This randomization
was used to eliminate bias in the testing.
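
A minimal sketch of such an order randomization is given below, assuming a simple independent coin flip per user; the paper does not state the exact assignment procedure, and the function name `assign_orders`, the user identifiers, and the seed are hypothetical.

```python
import random

ENVIRONMENTS = ("workstation", "CAVE")


def assign_orders(user_ids, seed=None):
    """Give each user a randomly ordered sequence of the two environments."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    orders = {}
    for uid in user_ids:
        order = list(ENVIRONMENTS)
        rng.shuffle(order)  # random starting environment for this user
        orders[uid] = tuple(order)
    return orders


# 30 hypothetical user identifiers, matching the number of users reported.
orders = assign_orders(range(1, 31), seed=2005)
for uid in (1, 2, 3):
    print(uid, orders[uid])
```

Independent shuffling does not guarantee an even split between the two starting environments. A design that needs exactly half of the users in each order would instead shuffle a pre-balanced list of orders.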

2. Benchmark 4, Pass 3, Faults Count Analysis

The following is a presentation of the Benchmark 4, pass 3 faults count analysis for all the users.
Pass 3 results are presented here as representative of the users’ final, best-case results. All other
results are presented in Appendix D [3].

Figure 3 presents the users’ ability to find faults in a span of four minutes in each of the two
environments. The results clearly indicate a higher fault count using the stereoscopic CAVE™
environment. In the CAVE™, users located on average 9.17 (about 9) of the 10 faults in the
four minutes. At the workstation, users located on average 7.1 (about 7) of the 10 faults in the
same time.

Figure 3: B4p3 Faults Count


B4P3    # Users   Mean   St. Dev.   Low   High   P Value   Normal?   CV
Cave    30        9.17   0.7        8     10     <0.10     No        1%
W/S     30        7.1    0.66       6     8      <0.10     No        1%

              Homogeneity of Variance            Test for Differences
              (Levene's Test)                    (Mann-Whitney Test)
              F-Value   Pr > F   Equal Var?      Value   Pr > T   Equal?   Significant?
Cave vs W/S   0.26      0.61     Yes             6.53    <0.001   No       Cave

Table 1: B4p3 Faults Count Statistics


3. Benchmark 4 Pass 3 (B4p3) Descriptive Statistics


Table 1 (Benchmark 4 pass 3 faults count / B4p3) presents the results of the descriptive statistics
analysis of the users’ pass 3 fault counts in the two test environments. The K.S. test is used to test
for normality of the data. Since the P value is less than 0.1, the data are not normal. Levene’s test
was then used to test for equal variance. Since the P value is greater than 0.1, the data have equal
variance. Since the data are not normal, the Mann-Whitney test is used. With the Mann-Whitney
test, the P value is less than 0.1, which indicates that the medians are unequal for the CAVE™ and
the workstation. Examination of these results shows that the differences between the two
environments are statistically significant. The conclusion then is that, at the 90% confidence
level, there is significant evidence to support the alternative hypothesis (Ha). Thus, since the
stereoscopic wand environment demonstrates a higher fault count, the CAVE™ is statistically
“better” than the non-stereoscopic workstation environment for Benchmark 4 during pass 3.
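
For readers who want to see what the final step of this procedure computes, the following is a minimal pure-Python sketch of a two-sided Mann-Whitney test using the tie-corrected normal approximation, without a continuity correction. It is not the NCSS implementation used in the study, so its statistic need not match the values in Table 1, and the two samples are made-up fault counts chosen only to exercise the code.

```python
import math


def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney(x, y):
    """Return (U for x, z, two-sided p) from the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction: sum of t^3 - t over each group of t tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t**3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every observation identical: no evidence of a difference
        return u1, 0.0, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return u1, z, p


# Made-up fault counts for six users per environment (illustrative only).
cave = [9, 10, 9, 8, 10, 9]
workstation = [7, 6, 8, 7, 7, 8]

ALPHA = 0.10  # significance level corresponding to the 90% confidence level
u, z, p = mann_whitney(cave, workstation)
print("reject H0" if p < ALPHA else "fail to reject H0")
```

The tie correction matters for this kind of data because fault counts are small integers, so many users share the same value.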

4. Benchmark 4 Pass 3 Overall Impressions Ratings Analysis

Figure 4 (Benchmark 4 pass 3 Overall Impressions Ratings / B4p3Ovr) graphically presents
comparisons of the Benchmark 4 (faults count) pass 3 overall ratings of the two environments.
Inspection of the average ratings shows that users preferred the stereoscopic environment
(CAVE™) over the non-stereoscopic environment (workstation).

5.  Detailed  Statistical  Analysis 

The following sections present a detailed statistical analysis of user overall impressions ratings
of the two test environments following their third and final pass of the Benchmark 4 scenario. All
other results are presented in Appendix D [3]. The statistical analysis of these ratings provides
insight into the final opinions of the users. As discussed before, the NCSS software package was
used to perform each analysis. Each set of user overall impressions ratings is first examined to
determine if the data are normally distributed (Gaussian distribution) using the KS statistic. The
descriptive statistics test results are presented in tabular form, followed by the results of Levene’s
test for equal variance of the data. The null hypothesis (Ho) and alternative hypothesis (Ha)
discussed for the Benchmark 1 statistical analysis testing apply here (Benchmark 4) as well.

6.  Benchmark  4  Pass  3  Overall  Impressions  Ratings  Statistics 

Table 2 presents the results of the descriptive statistics analysis of the users’ Benchmark 4 pass 3
overall impressions of the interface. The K.S. test is used to test for normality of the data. Since the
P value is less than 0.1 for both the workstation and the CAVE™, the data are not normal. Levene’s
test is used to test for equal variance; since the P value is greater than 0.1, the data have equal
variance. Since the data are not normal, the Mann-Whitney test is used. With the Mann-Whitney
test, the P value is less than 0.1, which indicates that the medians are unequal for the CAVE™ and
the workstation. Examination of these results shows that the differences between the two
environments are statistically significant. The conclusion then is that, at the 90% confidence
level, there is significant evidence to support the alternative hypothesis (Ha). This indicates that
the CAVE™ environment is preferred over the workstation environment in the Benchmark 4 pass 3
overall impressions subjective ratings.

Figure 4: B4p3Ovr Overall Impressions Ratings


B40P3   # Users   Mean   St. Dev.   Low   High   P Value   Normal?   CV
Cave    30        4.65   0.2        4     5.00   <0.10     No        4.00%
W/S     30        4.36   0.23       3.8   4.60   <0.10     No        5.00%

              Homogeneity of Variance            Test for Differences
              (Levene's Test)                    (Mann-Whitney Test)
              F-Value   P Value   Equal Var?     Value   P Value   Equal?   Significant?
Cave vs W/S   0.01      0.99      Yes            -4.69   <0.001    No       Cave

Table 2: B4p3Ovr Overall Impressions Ratings Statistics

B4 Pass to Pass Comparison

        Pass 1 to Pass 2    Pass 2 to Pass 3    Pass 1 to Pass 3
        Diff      %         Diff      %         Diff      %
Cave    -0.93     -12%      -0.77     -9%       -1.7      -23%
W/S     -0.93     -16%      -0.37     -5%       -1.3      -22%

Table 3: B4 Pass-to-Pass Comparison of Faults Count

Table 3 presents the pass-to-pass comparison of Benchmark 4 (faults count). The negative values in
Table 3 indicate that the pass 1 faults count was lower than that of pass 2, and the pass 2 faults
count was lower than that of pass 3. For example, the value of -22% for the workstation (pass 1 to
pass 3) is calculated as (5.8 - 7.1)/5.8, where 5.8 and 7.1 represent the means of Benchmark 4 for
pass 1 and pass 3 respectively. From Table 3 one can conclude that users showed more improvement
from pass to pass in the CAVE™ than at the workstation. This is attributed to the fact that users
found the faults more easily in a four-screen CAVE™ than on a single-screen traditional CAD
workstation.
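
The pass-to-pass calculation quoted above can be reproduced with a short sketch. The function name `pass_to_pass_change` is an assumption made for this illustration; the inputs 5.8 and 7.1 are the workstation pass 1 and pass 3 means given in the text.

```python
def pass_to_pass_change(earlier_mean, later_mean):
    """Return (difference, percent change) using the paper's sign convention.

    The difference is earlier minus later, so a negative value means the
    later pass had the higher mean.
    """
    diff = earlier_mean - later_mean
    pct = 100 * diff / earlier_mean
    return diff, pct


# Workstation faults count, pass 1 to pass 3 (means taken from the text).
diff, pct = pass_to_pass_change(5.8, 7.1)
print(f"{diff:.2f} {pct:.0f}%")  # prints -1.30 -22%
```

Note that the percentage is taken relative to the earlier pass, which is why equal absolute differences can yield different percentages for the two environments.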

B4 Overall Ratings Pass to Pass Comparison

        Pass 1 to Pass 2    Pass 2 to Pass 3    Pass 1 to Pass 3
        Diff      %         Diff      %         Diff      %
Cave    0.44      -13%      -0.69     -17%      -1.13     -32%
W/S     0.2       6%        -0.83     -24%      1.03      31%

Table 4: B4 Overall Impressions Ratings Pass to Pass Comparison

Table 4 presents the pass-to-pass comparison of the Benchmark 4 overall impressions subjective
ratings. The negative values in Table 4 indicate that pass 1 ratings were lower than pass 2 ratings,
and pass 2 ratings were lower than pass 3 ratings. For example, the value of -32% for the CAVE™
(pass 1 to pass 3) is calculated as (3.52 - 4.65)/3.52, where 3.52 and 4.65 represent the means of
the Benchmark 4 overall impressions ratings for pass 1 and pass 3 respectively. From Table 4 one
can conclude that the CAVE™ environment is preferred over the workstation.

(Survey form fields: after-test impression of the overall system; comments.)

Figure 5: Usability Survey Questionnaire (Satter, 2005)


7.  CONCLUSIONS 

For Benchmark 4 (fault identification), the statistics show better results (higher fault counts and
higher subjective ratings) for the CAVE™ than for the workstation in both objective and subjective
measures.

The results presented below support the objective of this research, namely that the state-of-the-art
Perceptual User Interface or PUI (CAVE™ and wand) is a better, more efficient, and faster
environment than the traditional Graphical User Interface or GUI (workstation and mouse):

•  94% of the results were in favor of the CAVE™ in both objective and subjective measures.

•  2/3 of the results for pass-to-pass improvement were better for the CAVE™ for both
objective and subjective measures.